Abstract
This randomized controlled study examines the effects of an intervention, Fusion Reading, on 
the reading achievement and motivation of adolescent struggling readers. Fusion Reading was 
implemented in grades 6 through 10 in four middle schools and three high schools from three 
districts in the southeast and western Michigan. Eligible struggling readers were assigned 
randomly to the Fusion Reading intervention or a “business as usual” control condition which 
did not include additional reading instruction. Intervention students received the 
multicomponent, strategy-based Fusion reading intervention from trained teachers for one class 
period, 5 days a week, for a school year. Results indicated a statistically significant impact for 
the intervention on the Sight Word Efficiency subtest of the Test of Word Reading Efficiency 
(TOWRE), with an effect size (Glass's Δ) of 0.11. Results are discussed in the context of previous
reading intervention research findings with adolescents. 
Keywords: randomized controlled trials, adolescent struggling readers, reading achievement,
Fusion Reading intervention

The need for schools to focus on literacy during the adolescent grades is greater than ever. In
2011, 66% of fourth-grade and 70% of eighth-grade students were reading below proficiency on
the National Assessment of Educational Progress (NAEP) (National Center for Education
Statistics, 2011). As many as one-third of fourth-grade students and nearly a quarter (24%) of 
eighth-grade students were reading even below the basic level. Low levels of literacy, as 
demonstrated by the NAEP, are alarming given reading demands inherent in academics and in 
the workforce (e.g., reading for comprehension, processing text to acquire and use information) 
(Kamil et al., 2008; Levy & Murnane, 2004). It is imperative that adolescents who struggle to 
read have access to effective, evidence-based interventions to succeed in school and in life. 

Significant research has been conducted to understand and contribute to reading practice in 
the early grades (e.g., Reading First, No Child Left Behind), and although there is a plethora of
reading interventions for adolescents, relatively few interventions have been subjected to 
rigorous tests (Kamil et al., 2008; Shanahan, 2006). In response, the U.S. Congress authorized 
funding for Striving Readers in 2006 and in 2009 through discretionary grants administered by 
the U.S. Department of Education. 

The Striving Readers initiative was intended to increase adolescent literacy levels in Title I- 
eligible schools and to build a strong, scientific research base by identifying and replicating 
strategies that improve adolescent literacy skills. To participate, students had to be reading at 
least 2 years below grade level. The current study of the Fusion Reading intervention is one of 
eight projects funded in 2009 that spanned 2 years, with 1 year each of planning and of research.
Although Fusion Reading was designed as a 2-year reading intervention, the current study
reports on findings for 1 year of Fusion Reading implemented under the Striving Readers
initiative.¹

Research on Adolescent Reading Interventions 

Two frameworks on how skilled readers read contribute to the design of Fusion Reading. 
From the framework of the “Simple View of Reading,” skilled reading requires both word-level 
reading skills and linguistic comprehension (Gough & Tunmer, 1986; Hoover & Gough, 1990). 
Weakness in either component impairs reading. Hock et al. (2009), for example, found that over 
60% of the 8th- and 9th-grade students reading at or below the 40th percentile struggled with all
examined domains of reading, including decoding, word recognition, vocabulary, fluency, and 
comprehension. From a metacognitive framework, skilled reading is strategic in the sense that 
proficient readers are purposeful in what and how they read (Baker & Brown, 1984), and enlist a 
repertoire of strategies to construct meaning from text (Paris, Wasik, & Turner, 1991). Thus, 
strategy-based interventions targeting multiple outcomes, including comprehension, are likely to 
be most effective for the adolescent reader. 

Two recent meta-analyses provide evidence that strategy interventions targeting multiple 
outcomes tend to be highly effective. Edmonds and colleagues (2009) conducted a meta-analysis 
with 17 experimental and quasi-experimental studies for students between 6th and 12th grades 
and found that the two largest effect sizes were for interventions targeting reading 
comprehension (1.23) and multiple-component interventions targeting multiple outcomes, 
including reading comprehension (0.72). Although not as large as the other outcomes, Edmonds
et al. also found a moderate effect size for word study interventions (0.34), but not for fluency
interventions (-0.03).


¹ The original intent was to fund the projects for 4 years (i.e., 1 planning and 3 implementation years). Funding was not
authorized by the U.S. Congress to support the remaining years of the study.

Scammacca et al. (2007) examined 16 studies in addition to Edmonds et al.’s studies.
Whereas the Edmonds et al. study weighted standardized measures more heavily than non-standardized
outcome measures in their effect size composites, Scammacca was more specific in reporting 
effect sizes by outcome measure type, and reported them as follows: 1) all outcome measures, 2) 
standardized outcome measures, 3) all reading comprehension measures, and 4) standardized 
reading comprehension measures. The largest effects for all standardized outcome measures 
were found for word study (0.68), comprehension (0.56), and multiple-component interventions 
(0.41). Fluency had the lowest effect size (0.04). When examining effect sizes for standardized 
reading comprehension measures (rather than more general standardized reading measures), 
multiple-component interventions had the largest effect size (0.59), followed by comprehension
(0.54) and word study (0.40) interventions. Again, fluency interventions had a small and
negative effect size (-0.07). Additionally, they report that for standardized reading
comprehension outcome measures, stronger effects were found for middle school students
(0.47) than for high school students (0.14). Researcher-implemented interventions also had a
comparatively large effect size (1.08) in comparison with teacher-delivered interventions (0.21)
on all standardized outcome measures. The Edmonds et al. (2009) and Scammacca et al. (2007) meta-analyses
suggest that for adolescents, the most effective interventions are ones that target either 
comprehension or multiple areas of reading. 

Strategy-based interventions. Strategy-based interventions aim to teach students procedures 
or steps for solving problems while reading and understanding text (e.g., identifying unfamiliar 
words, decoding words) (Mayer, 1987). Strategies may be cognitive in nature (e.g., paraphrasing, 
questioning), metacognitive (e.g., comprehension monitoring), or behavioral (e.g., using a
dictionary to look up words) (Almasi, 2003). The ultimate goal is not to “teach strategies,” but
rather to teach students to be strategic in their reading. That is, they are skilled in making
decisions about which procedures should be implemented while reading text. Strategy-based 
programs differ in the way in which strategies are presented to students, such as the number of 
strategies or the ways in which strategic knowledge is acquired through direct or indirect 
teaching methods (Lenz & Hughes, 1990). 

Many of the multistrategy interventions available today are founded in single-strategy
interventions that have been developed and tested throughout the past three decades. One 
example of such a single-strategy intervention is the Word Identification Strategy, which aims to 
increase decoding of common, multisyllabic words through a 7-step mnemonic process for 
identifying words (Lenz & Hughes, 1990). Using a single subject design, Lenz and Hughes 
found notable decreases in the number of word identification errors. The Word Identification
Strategy eventually became part of other multistrategy interventions, such as Xtreme
Reading (Schumaker et al., 2006) and Fusion Reading (Brasseur, Hock, & Deshler, 2010a,
2010b, 2010c; Hock, Brasseur, & Deshler, 2010a, 2010b, 2010c, 2010d, 2010e). 

In the aforementioned meta-analysis, Edmonds and colleagues (2009) note that
several of the experimental, single-strategy intervention studies they reviewed yielded large
effect sizes, particularly strategies that teach how to use graphic organizers (ES = 1.68 for
producing relational statements) and strategies that teach how to find main ideas in text (ES =
2.23 for producing main idea statements). They point out, however, that many of these studies
used outcome measures that were researcher-developed and closely aligned with the very
learning strategies they intended to measure. Gersten et al. (2001) also noted in their review of
reading comprehension research for students with learning disabilities that, although
single-strategy interventions tended to have greater impacts for certain skills (e.g., reorganizing
expository text, self-questioning, using a mapping organizer, and direct instruction for
summarizing text), transfer could not be confirmed due to the small number of studies. Thus, the 
degree to which single-strategy interventions may be generalized is questionable. 

Over the last few decades there has been a noticeable movement away from single-strategy 
approaches (Pressley, Harris, & Marks, 1992) and a movement toward creating and using 
multistrategy interventions. Often these multistrategy interventions build on single-strategy 
interventions by combining strategy instruction and taking a more flexible approach to teaching 
(Gersten et al., 2001). One example of a multistrategy approach is reciprocal teaching, in which a 
dialogue occurs between teachers and students about the text in the context of summarizing, 
question generating, clarifying, and predicting (Palincsar & Brown, 1984, 1986). In isolation,
summarizing, questioning, clarifying, and predicting likely all are effective; however, in 
multistrategy interventions, it is the integration of the singular strategies in a meaningful way 
that distinguishes the intervention as a formal intervention and not a compilation of individual 
strategies. 

Several veins of research have aimed to develop strategy-based interventions to improve 
reading outcomes. One such example is Deshler et al.’s (2002) Learning Strategies Curriculum 
(LSC). The LSC aims to improve adolescent reading achievement through strategy use and 
increased motivation for reading using a standardized eight-step instructional process that covers 
three strands (Cantrell, Almasi, Carter, & Rintamaa, 2011; Cantrell, Almasi, Carter, Rintamaa, &
Madden, 2010; Center for Research on Learning, n.d.). The three strands include information
acquisition, studying information once it is acquired, and expression of oneself through writing.
A recent randomized controlled trial of the LSC with sixth- and ninth-grade students found
that in the first year of the study, there was a significant, medium effect size for 6th-grade
students (0.22), but not for 9th-grade students (0.08) (Cantrell et al., 2010). Across all three years
of the study, however, the impacts yielded a small, significant effect (0.08) for sixth-grade
students and a small, but significant, effect size (0.15) for ninth-grade students on a standardized
measure of reading comprehension (Cantrell et al., 2011). Effects were stronger for students in
general education (0.19) than in special education (-0.07, ns), and there was a significant effect of
the LSC on motivation for reading for both sixth- and ninth-grade students (0.11) across all three years.
Unfortunately, evaluations of other similar, strategy-based interventions are scarce. 

A Description of Fusion Reading 

Fusion Reading is a supplemental reading intervention designed for middle and high school 
students who score at least 2 years below grade level on standardized reading measures 
(Brasseur, Hock, & Deshler, 2010a, 2010b, 2010c; Hock, Brasseur, & Deshler, 2010a, 2010b,
2010c, 2010d, 2010e). It builds upon and shares the evidence behind the work of the Strategic
Instruction Model’s Learning Strategies Curriculum and Xtreme Reading² by integrating some of
the same strategies, focusing on reading, and extending the time frame from 1 to 2 years in
duration. 

Fusion is a fully developed instructional package with a specific curricular scope and 
sequence of high-leverage reading strategies within a framework focused on strategies for 
teaching comprehension and vocabulary, and for increasing motivation for reading. The 2-year 
scope and sequence, instructional routines, and materials are described below. The developers 
(Brasseur, Hock, & Deshler, 2010a, 2010b, 2010c; Hock, Brasseur, & Deshler, 2010a, 2010b,
2010c, 2010d, 2010e) recommend no more than 15 students per class. Struggling students are 
enrolled in the supplemental intervention for one class period for 5 days a week. 


² Xtreme Reading is a 1-year, supplemental intervention founded in strategic instruction (Schumaker et al.,
2006).

Curriculum scope and sequence. There are nine units and a student project: (1) the
Establish the Course unit provides students with rationales for the course, an overview of course 
content, and expectations for classroom management; (2) the Prediction Strategy unit explicitly 
teaches students the first reading comprehension strategy, and students learn how to preview 
reading selections, link prior knowledge to the subject, make predictions and inferences about 
content, and evaluate reading to answer student generated questions and predictions; (3) the 
Possible Selves unit surfaces long-term future goals and establishes action plans that link 
attainment of personal goals to reading proficiency; (4) the Bridging Strategy unit provides 
instruction in advanced phonics, decoding, word recognition, and reading fluency; (5) the 
Strategy Integration unit teaches students how to integrate prediction, bridging, and vocabulary 
strategies and provides students with opportunities to apply integrated strategies to reading 
content area textbooks; (6) the Summarization Strategy unit teaches students to summarize small 
sections of books, chapters, and some longer passages; (7) the Strategy Integration unit 
continues teaching and providing opportunities for students to practice integrating strategies and 
applying them to reading; (8) the PASS the Test unit teaches students a reading strategy they can 
use to do well on standardized tests; (9) the Advanced Strategy Integration unit continues 
teaching and providing opportunities for students to practice integrating strategies and applying 
them to reading; and (10) students do a final Fusion Reading intervention project to apply the 
reading strategies they have learned. 

Instructional routines. Routines take place as follows: (1) Warm-Ups (3 to 5 minutes):
students are engaged in an activity at the beginning of class to provide a connection to class
readings and key strategies; (2) Thinking Reading (5 to 7 minutes): a structured process in which
the teacher demonstrates expert reading behaviors and provides an opportunity for students to
participate in the process; (3) Explicit Instruction (30 minutes): for each strategy, teachers
describe, explain, and model specific metacognitive steps of the strategy; students verbally 
practice the steps of the strategy and practice using the strategy first with materials at their 
instructional level and later with increasingly difficult materials; students receive elaborated 
feedback from the teacher until they gain proficiency and are able to use the strategy in a 
generative way and apply the strategy to assignments in a wide variety of materials and settings; 
(4) Vocabulary (10 minutes): explicit vocabulary instruction follows a seven-step vocabulary 
process; and (5) Wrap-up (5 minutes): students review the lesson. 

Teacher manuals. Fusion lesson formats were provided for either 90-minute block or 54- 
minute class schedules and include multiple instructional activities such as whole class explicit 
instruction, guided practice, partner practice, and teacher-led individualized instruction. Each 
lesson plan comes with a 1-page overview that includes learning objectives; a lesson-at-a-glance 
chart with approximate time needed for each activity and a short description of activities for the 
lesson and required materials; an example lesson script for each lesson that consists of a detailed, 
step-by-step process model of the lesson with both written and visual cues; and the materials 
necessary to teach the lesson, such as strategy cue cards, reading passages, assessment score 
sheets, and progress charts and graphs. Progress assessment forms and answer sheets are 
provided at the beginning and end of each Fusion Book, and formative assessment activities are 
available during partner and individual practice sessions throughout each unit. 

Student workbooks. Workbooks are available for The Bridging Strategy, Prediction 
Strategy, and Summarization Strategy. Age-appropriate trade novels and short stories (e.g.,
Bluford High School Series from Townsend Press) and over 110 short expository passages
(about 400 words each) are included in Fusion materials.

Research Questions and Logic Model

The purpose of the current study was to evaluate the effectiveness of the 2-year Fusion 
Reading intervention after 1 year of implementation for 6th- through 10th-grade students in three 
public school districts in Michigan.³ Specifically, the study addressed the following:

1. What are the intent-to-treat impacts of the Fusion Reading intervention on the reading 
outcomes and motivation to read of struggling readers after receipt of 1 year of the 
intervention? 

2. For which students is the intervention most and least effective?

3. In what ways are implementation factors associated with impacts (or lack of impacts) on 
reading and motivation outcomes? 

For the evaluation of the Fusion Reading intervention, we created a logic model that frames 
the research questions and hypotheses and guides the choice of implementation and student 
outcome measures and analysis approach. 

<Figure 1 here> 

The model distinguishes between short- and long-term student outcomes, as the model 
hypothesizes that there are mediating student outcomes (e.g., coverage of the content and 
attendance-related outcomes) that affect long-term outcomes (e.g., increased motivation to read, 
increased reading proficiency on standardized outcome reading measures). The intervention
inputs include professional development provided to administrators and teachers during
face-to-face workshops, in-class coaching and coaching via video chats with the developers, and
student assignment to Fusion classes.

³ Although there is some promising evidence that Fusion Reading is effective when used with adolescent students, the
studies are limited by the study design (e.g., a pre-post design, nonequivalent groups at baseline for a quasi-experimental design,
or only one school included in the experimental study).

Methods 

Study Design 

Four middle schools and three high schools from three districts in the southeast and western 
suburban areas of Michigan participated in the Fusion Reading Intervention Study in the 2010-11 
academic year. Blocking on schools and grade level, students in grades 6 through 10 were 
randomized to either the intervention or control condition. Both Fusion and control students 
participated in regular English language arts (ELA) classes at their school. However, students in 
the intervention condition received Fusion Reading as a supplemental reading intervention in the 
2010-11 school year, whereas students in the control condition engaged in nonliteracy activities. 
This study examined the intent-to-treat effect of the Fusion Reading intervention on student 
reading achievement, including performance on the state accountability test and motivation, as 
well as the treatment-on-the-treated effect of fidelity of implementation on student outcomes. 


Participants 


The Michigan State Department of Education recruited schools for this study by inviting seven
schools based on district and school improvement goals, each school’s need to improve the reading
skills of its students, and its willingness to participate in a randomized controlled study at the
student level. The seven participating schools ranged in enrollment from 400 to 1,400
students. The percentage of students eligible for free or reduced-price lunch ranged from 51% to
96%. Across the seven schools, the percentage of students reading below proficiency on the 2009
Michigan Educational Assessment Program (MEAP) reading test ranged from 26% to 61%, with
an average of 42%.


Determining student eligibility. Students in each of the schools had to meet the following 
criteria to be eligible to participate in this study. They had to score between the 5th and 35th
percentiles on the Test of Silent Contextual Reading Fluency (TOSCRF).⁴ Students were
excluded if they were (a) identified as a student with a severe cognitive disability, (b) a
Level-1 English language learner (ELL),⁵ or (c) a recipient of any other reading interventions as
required by an IEP. Of the 2,109 students screened, 871 students were found to be eligible for the
study (41.2%).
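
The eligibility rule described above can be expressed as a short screening function. The following is a minimal sketch, not the study's actual screening procedure: the record fields and function name are hypothetical, and treating the 5th and 35th percentile bounds as inclusive is an assumption, since the text does not say how boundary scores were handled.

```python
from dataclasses import dataclass


@dataclass
class ScreenedStudent:
    # Hypothetical record layout; field names are illustrative only.
    toscrf_percentile: float
    severe_cognitive_disability: bool = False
    level1_ell: bool = False
    iep_reading_intervention: bool = False


def is_eligible(student: ScreenedStudent, low: float = 5, high: float = 35) -> bool:
    """Apply the stated inclusion range and the three exclusion criteria.

    Inclusive bounds are an assumption of this sketch.
    """
    in_range = low <= student.toscrf_percentile <= high
    excluded = (
        student.severe_cognitive_disability
        or student.level1_ell
        or student.iep_reading_intervention
    )
    return in_range and not excluded
```

A student at the 20th percentile with no exclusion flags would pass this check, whereas the same student flagged as a Level-1 ELL would not.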


Fusion teachers. Schools followed district procedures to hire nine intervention teachers. 
Fusion developers from the University of Kansas Center for Research on Learning (KU-CRL) 
submitted a description of the necessary skills and knowledge most often required by teachers 
who had been successful implementing the Fusion Reading intervention to each school principal 
to guide their hiring decision. Two teachers were hired for each of the two large high schools and 
one teacher was hired for each of the other schools. Throughout the year, the KU-CRL trained 
the teachers to deliver the intervention via face-to-face professional development and on-site 
coaching. These nine teachers taught Fusion to the treatment students, and did not provide any
literacy services to control students in the participating schools. The original nine teachers were
female; 44% were white and 56% were African American. They averaged 9 years of
experience. Seven of the nine teachers had certification in ELA, and two had a reading
specialist certification; five had master’s degrees, and two had bachelor’s degrees. All teachers
had previous experience teaching struggling readers. Of the nine teachers hired, one resigned in
December and was replaced initially by a substitute teacher and then, in February, by a teacher
who resigned in early April after the Striving Readers funds for Year 2 were cut.

⁴ The Test of Silent Contextual Reading Fluency (TOSCRF) is a nationally normed reading fluency test with test-retest
reliability ranging from .85 to .88 across four different test forms. Moderate criterion validity has been
established for the TOSCRF, as its correlations with the Wechsler Intelligence Scale for Children-
Third Edition (1991), Stanford Achievement Test Series-Ninth Edition (1996), and Woodcock Johnson III (2001) are
reported to be .50, .56, and .68, respectively. It is quick to administer and has been used in a number of research
studies (Hammill, Wiederholt, & Allen, 2006).

⁵ Level-1 ELLs are ELLs with limited formal schooling who recently arrived in school, have not been assessed with
the Michigan English Language Proficiency Tests or other placement tests, or ELLs who have preproduction or
early production English skills.

Teacher training. In Year 1, Fusion trainers provided 9 days of professional development
for teachers plus, on average, 40 hours of coaching for teachers, and 2 days for administrators,
with a 2-day orientation session in late spring of the planning year (teachers and administrators) using
print and video materials. In Year 1, teachers were trained to use the following strategies:
Establish the Course, Prediction Strategy, Possible Selves Motivation Strategy, Strategy
Integration I, and an introduction to the Summarization Strategy. Students in the intervention
condition received daily instruction on these reading strategies. The Fusion Reading intervention was
scheduled between 48 and 60 minutes daily for schools on a semester schedule and between 70 
and 73 minutes daily for schools on a trimester schedule. 

Parent consent. Schools informed parents about their student’s potential for participation in 
the study by sending them a passive consent form in both English and Spanish prior to 
randomization. One student’s parents refused participation prior to randomization and four 
refused participation after randomization. 

Random assignment procedure. Eligible struggling readers with parent consent were 
randomly assigned to either Fusion Reading or a non-literacy, elective course. We constructed
strata on the basis of the following two factors: school (seven levels) and grade level (grade 6,
7, or 8 for middle schools and grade 9 or 10 for high schools). Students within each stratum were
randomly assigned to Fusion or control conditions, resulting in 367 students in the treatment 
condition and 389 in the control condition (see Figure 2). Stratifying techniques can effectively 
remove 90% of the bias due to the stratifying variables and ensure the number of treatment and 
control group students are closely balanced within each stratum (Shadish et al., 2002). 
<Figure 2> 
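
The blocked assignment described above can be illustrated with a short sketch. This is not the study's assignment code: the function name, the tuple layout, the fixed seed, and the roughly 1:1 allocation within each stratum are all assumptions made for illustration, since the text does not report the allocation mechanism.

```python
import random
from collections import defaultdict


def assign_within_strata(students, seed=0):
    """Randomly assign students to conditions within school-by-grade strata.

    `students` is a list of (student_id, school, grade) tuples (hypothetical
    layout). Returns a dict mapping student_id to "fusion" or "control".
    A 1:1 split within each stratum is an assumption of this sketch.
    """
    rng = random.Random(seed)

    # Group student IDs by stratum (school, grade).
    strata = defaultdict(list)
    for student_id, school, grade in students:
        strata[(school, grade)].append(student_id)

    assignment = {}
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)
        n_treat = len(ids) // 2
        # In an odd-sized stratum, the extra student goes to a random arm.
        if len(ids) % 2 == 1 and rng.random() < 0.5:
            n_treat += 1
        for position, student_id in enumerate(ids):
            assignment[student_id] = "fusion" if position < n_treat else "control"
    return assignment
```

Shuffling inside each stratum keeps the two conditions balanced on school and grade by construction, which is the property the stratification is meant to secure.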

Random assignment monitoring. Random assignment was monitored in three ways: (1) 
The Fusion Reading teachers were required to enter attendance data for their students each day. 
When changes were made to classroom assignment, teachers notified the evaluation team of the 
change and the reason for the change (e.g., student moved). In instances where the reason was 
unknown, the principal or school contact was asked to investigate the matter and place the 
student back in the class when possible. (2) Class rosters were also requested from the Fusion 
Reading teachers every quarter or trimester so that researchers could follow up on discrepancies 
from original classroom composition. (3) School rosters and class schedules were obtained at 
least twice per year for all students to confirm that the control students remained at the school 
and that neither treatment nor control students were participating in any literacy-related courses. 
Similar procedures of contacting the school contact were followed for any discrepancies with the 
treatment or control conditions. 
Student Outcome Measures 

The researchers hired and trained local data coordinators to collect student outcome data. 
Local data coordinators were blind to students’ experimental condition at the time of assessment.
Three tests were administered at pretest and posttest, as described below.

TOWRE. The TOWRE has two subtests: Sight Word Efficiency (SWE) and Phonetic
Decoding Efficiency (PDE). SWE measures the number of real printed words that a student 
accurately reads within 45 seconds (Torgesen, Wagner, & Rashotte, 1999). PDE measures word
recognition skills by counting the number of pronounceable, printed nonwords that a student
can accurately decode within 45 seconds (Torgesen et al., 1999). The correlations of SWE
and PDE with the DIBELS Nonsense Word Fluency measure are 0.73 and 0.75, respectively (Hagan-
Burke, Burke, & Crowder, 2006). PDE also has a correlation of 0.85 with the Word Attack
subtest of the Woodcock Reading Mastery Tests-Revised (WRMT-R), and there is a correlation of 0.89
between the Sight Word Efficiency subtest and the Word Identification subtest of the WRMT-R. We
used the standard scores for each subtest in the analysis. 

GRADE. The Group Reading Assessment and Diagnostic Evaluation (GRADE) is a norm- 
referenced, standardized diagnostic reading test. The GRADE for secondary students includes 
sentence comprehension, passage comprehension, vocabulary, and listening comprehension 
(Williams, 2001). The GRADE is an untimed test, but adolescents usually complete it in about 
40 minutes. The Passage Comprehension subtest requires the participant to read graded passages 
and to respond to comprehension questions. The Sentence Comprehension subtest uses a cloze 
task in which the student reads a sentence and chooses the appropriate word for a blank space. 
The Vocabulary subtest assesses decoding and word-level understanding. Students are presented 
a short phrase or sentence with a target word followed by five choices. This study did not 
administer the listening comprehension subtest. The GRADE has high internal reliabilities, 
ranging from 0.95 to 0.99 for each form, level, and grade group. In addition, the GRADE
correlates at 0.86 to 0.90 with the Gates-MacGinitie Reading Tests, at 0.82 to 0.87 with the
reading tests of the California Achievement Test, and at 0.69 to 0.83 with the Iowa Test of Basic
Skills. We used the standard scores for each subtest in the analysis. 

MEAP reading. Michigan’s MEAP reading achievement test is a criterion-referenced high-
stakes accountability test administered to students in grades 3 through 8. States set student
performance standards, often called adequate yearly progress, by which schools are held
accountable. The reliability of MEAP reading tests ranges from 0.81 to 0.90 across grades
(Chianca & Coryn, 2006). The MEAP reading test reports scale scores and performance levels for
each student. Because the MEAP reading test is not vertically aligned, scores from different grades cannot be
compared directly. Therefore, MEAP reading scores were converted to Z-scores for sixth- and
seventh-graders using the population mean and standard deviation for each grade level as reported
by the Michigan Department of Education (MDE). Z-scores indicate the number of standard deviations above or below the state average
for a particular grade (Rivkin, Hanushek, & Kain, 2005). This study used Z-scores as the MEAP 
reading outcome variable in the analysis. 
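
The Z-score conversion described above amounts to subtracting the grade-level state mean from a student's scale score and dividing by the grade-level state standard deviation. A minimal sketch follows; the function name is hypothetical, and the numbers in the usage note are placeholders rather than actual MDE statistics.

```python
def meap_z_score(scale_score: float, grade_mean: float, grade_sd: float) -> float:
    """Standardize a scale score against the state mean and SD for that grade."""
    if grade_sd <= 0:
        raise ValueError("grade_sd must be positive")
    return (scale_score - grade_mean) / grade_sd
```

With a placeholder grade mean of 620 and standard deviation of 25, a scale score of 645 would sit one standard deviation above the state average, and a score of 595 one standard deviation below it.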

Motivation measure. Student motivation for reading was assessed with the Children’s 
Academic Intrinsic Motivation Inventory (CAIMI) Reading subtest (Gottfried, 1998). The 
CAIMI was developed to measure enjoyment of learning in the subject areas of reading, math, 
social studies, science, and school in general. For older adolescents, the coefficient alphas for the 
subject area subscales range from 0.93 to 0.95; coefficient alpha for the school in general 
subscale is 0.91 (Gottfried, 1998). Subject area subscales and the school general subscale are 
similar for elementary and middle school students. Thus, the instrument has substantial internal 
consistency. Construct validity of the CAIMI has also been reported throughout the school years
(Gottfried, 1985, 1986, 1990, 1998). The CAIMI reports T scores and percentile scores. This study
used T scores from the CAIMI reading subtest for the analysis.

Fidelity of Implementation Measures

Curriculum coverage. According to the developers of Fusion Reading, teachers should
cover at least 5 of the 10 lessons within two semesters or two trimesters of teaching the
curriculum (both of which contain approximately the same amount of allocated classroom time)
in Year 1 of the 2-year intervention:⁶ (1) Establish the Course, (2) Prediction Strategy, (3)
Possible Selves, (4) Strategy Integration I, and (5) Bridging Strategy. The curriculum coverage
proportion was calculated as the number of strategies covered by the teacher relative to the total
number of lessons the teacher should have covered during the first year of Fusion implementation.

Fusion dosage rate. The developers (Brasseur et al., 2010; Hock et al., 2010) suggested that,
to realize the full potential of the Fusion Reading intervention, classes need to be scheduled
daily, students should attend at least 80% of the allocated class time, and schools
should enroll, on average, 15 students per class. To better understand whether the amount
of exposure to the curriculum influences growth in reading, the researchers developed an online
data system for teachers to enter the actual time students attended classes in lieu of school
attendance records, which would not capture the time students left class early for various
extracurricular activities. Eight of the nine teachers recorded the amount of time each student
attended Fusion class each day. Records were not available from the teacher who left midyear. A
student’s dosage rate was calculated as the proportion of time the student was present in class
relative to the total number of allocated class minutes.
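
As a rough illustration of the dosage calculation, the sketch below computes a student's dosage rate from daily minutes and flags whether it reaches the developers' 80% guideline. The record layout, names, and minute values are assumptions made for the example; the study's online data system is not described in enough detail to reproduce.

```python
from dataclasses import dataclass

DOSAGE_THRESHOLD = 0.80  # developers' recommended minimum share of allocated class time


@dataclass
class AttendanceRecord:
    """Hypothetical per-student record: one entry per scheduled class day."""
    student_id: str
    minutes_attended: list[float]
    minutes_allocated: list[float]


def dosage_rate(record: AttendanceRecord) -> float:
    """Minutes present in class divided by total allocated class minutes."""
    allocated = sum(record.minutes_allocated)
    if allocated <= 0:
        raise ValueError("allocated minutes must be positive")
    return sum(record.minutes_attended) / allocated


def meets_dosage(record: AttendanceRecord, threshold: float = DOSAGE_THRESHOLD) -> bool:
    """True when the student attended at least the recommended share of class time."""
    return dosage_rate(record) >= threshold


# Illustrative week of 50-minute classes (made-up values).
low = AttendanceRecord("s01", [50, 50, 30, 0, 50], [50] * 5)    # 180 of 250 minutes
high = AttendanceRecord("s02", [50, 50, 45, 50, 40], [50] * 5)  # 235 of 250 minutes
```

Summing minutes rather than counting days present is what lets this measure capture partial absences, such as leaving class early, which school attendance records would miss.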


6 The schedules of the schools differed: two schools were on a trimester schedule, and the remaining schools
scheduled classes in two semesters per year. Developers adapted the Fusion pacing schedule to accommodate both
types of school schedules.

Statistical Analysis

Intent-to-treat analysis (ITT). ITT is the average effect of the treatment based on the initial
treatment assignment, regardless of how many participants actually received the treatment.⁷ The ITT
analyses present the impact of assignment to Fusion rather than the impact of Fusion on students
who actually received Fusion. The ITT impact estimate is the expected effect of Fusion when it is
implemented in the real world, with less than perfect teacher implementation and student dosage.
Hierarchical linear modeling (HLM) was used to account for the nesting of students in
schools. Dependent variables were reading achievement measures (TOWRE and GRADE) and
reading motivation (CAIMI). Independent variables included a constant, pretest scores,
demographic characteristics, and a treatment indicator. The HLM model for treatment effects is as
follows:

$$Y_{ik} = \beta_{00} + \beta_{01}\,\mathrm{Fusion} + \beta_{02}\,(COV_{ik} - \overline{COV}_{..}) + e_{ik} + \mu_{0k}$$

where $Y_{ik}$ is the outcome of student $i$ in school $k$ at posttest. Fusion indicates initial random assignment, with 1 for
intervention and 0 for comparison. The coefficient $\beta_{01}$ associated with Fusion in the above HLM
model indicates the average treatment effect in promoting improved student outcomes. $COV_{ik}$ are
the covariates (pretest and demographic characteristics), which were centered at their grand means. $\beta_{02}$ are the
coefficients associated with each covariate. $e_{ik}$ is the student random effect, and $\mu_{0k}$ is the school
random effect.

Sensitivity analyses were conducted to check the robustness of the impact of Fusion across
different specifications of the models. HLM was conducted with and without imputed pretest scores and
with and without controls for demographic characteristics (i.e., gender, race, grade level,
and disability status), which yields four sets of models for each outcome. A dummy variable
adjustment imputation approach was used, which sets the missing pretest scores to zero and adds
a dummy variable to indicate missingness in the impact model (Puma, Robert, Stephen, & Cristofer,
2009).

7 Although the Fusion Reading intervention program is intended as a 2-year intervention, due to funding changes this
study lasted only a single year. The focus of the analysis is on the 1-year effect of this intervention.

Effect sizes are reported as Glass’s Δ (Glass, 1977), calculated by dividing the
intervention indicator coefficient by the standard deviation of the control group. The improvement
index (What Works Clearinghouse, 2008) is also reported, which translates the effect size into an
improvement in percentile rank. The improvement index indicates the expected change in
percentile rank for the median comparison student if that student had received the Fusion
Reading intervention.
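
The two quantities can be sketched in a few lines. The function names are illustrative, and the improvement index is computed here under the assumption that it equals the normal-curve percentile of the effect size minus 50, which is the usual WWC convention; with that assumption the values agree with the improvement indexes reported in Table 2.

```python
from statistics import NormalDist


def glass_delta(estimated_impact: float, control_sd: float) -> float:
    """Glass's delta: treatment coefficient divided by the control-group SD."""
    if control_sd <= 0:
        raise ValueError("control_sd must be positive")
    return estimated_impact / control_sd


def improvement_index(effect_size: float) -> float:
    """Expected percentile-rank change for the median comparison student.

    Assumes the WWC convention: 100 * Phi(effect size) - 50.
    """
    return 100.0 * NormalDist().cdf(effect_size) - 50.0


# GRADE sentence comprehension, Model D in Table 2: impact 0.54, control SD 3.63.
delta = glass_delta(0.54, 3.63)
print(round(delta, 2))                    # 0.15
print(round(improvement_index(0.15), 2))  # 5.96
print(round(improvement_index(0.11), 2))  # 4.38
```

An effect size of 0.15 therefore corresponds to moving the median comparison student from the 50th to roughly the 56th percentile.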

The analysis of the outcomes is organized following the WWC beginning reading and
adolescent literacy domain classifications (What Works Clearinghouse, 2010). The TOWRE
SWE and PDE are two measures within the alphabetics domain. GRADE sentence
comprehension, passage comprehension, and vocabulary are in the comprehension domain.
MEAP reading belongs to the general reading achievement domain. To limit the false
discovery rate, the Benjamini-Hochberg (BH) approach was used to correct for multiple
comparisons (Benjamini & Hochberg, 1995) across multiple outcomes within the same
domain (What Works Clearinghouse, 2008). For example, in the alphabetics domain, which
includes two measures (TOWRE SWE and PDE), the BH approach was used to
control the false discovery rate at 0.05 (Benjamini & Hochberg, 1995).
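
A minimal sketch of the BH step-up procedure applied within one domain is shown below. The function name is illustrative, and the p-values are taken from Table 2 purely as an example (alphabetics under Model A, comprehension under Model D); the pairing of models is an assumption made for illustration, not a statement of which model the authors corrected.

```python
def benjamini_hochberg(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Return, for each p-value, whether it is rejected under the BH step-up rule."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            largest_rank = rank
    # Reject every hypothesis whose rank is at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[idx] = True
    return rejected


# Alphabetics domain (TOWRE SWE, TOWRE PDE), Model A p-values from Table 2.
alphabetics = benjamini_hochberg([0.022, 0.909])
# Comprehension domain (sentence, passage, vocabulary), Model D p-values from Table 2.
comprehension = benjamini_hochberg([0.043, 0.901, 0.892])
```

With two outcomes in a domain, the smallest p-value must fall at or below 0.025 to survive the correction; with three outcomes, the threshold for the smallest p-value is about 0.017.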

Because a previous study found that a similar multistrategy-based intervention improved the
reading skills of sixth-graders but not ninth-graders (Cantrell, Almasi, Carter, Rintamaa, &
Madden, 2010), this study conducted an exploratory analysis of whether Fusion improved the
achievement of students in certain grade levels but not others. HLM models were used to
estimate the impact of Fusion for students in each of grades 6 through 10.

Treatment-on-the-treated analysis (TOT). Although the ITT analyses estimate the average
effect of being assigned to an intervention, they do not yield the effect of the intervention for those students who
actually received it. This study used two approaches to estimate the effect of
treatment on the treated. The first is the instrumental variable approach. Because random
assignment is correlated with the fidelity of implementation measures (since control students
have a value of zero for each implementation measure) but uncorrelated with the error term in
the outcome equations, the treatment assignment indicator variable works as an instrument for
fidelity of implementation (Gennetian, Morris, Bos, & Bloom, 2005). A two-stage
least squares model was estimated to obtain the TOT. The first stage is

$$\mathrm{Fidelity}_{ik} = \beta_{00} + \beta_{01}\,\mathrm{Fusion} + \beta_{02}\,(COV_{ik} - \overline{COV}_{..}) + e_{ik}$$

where there were three sets of fidelity
measures: (1) percentage of curriculum completed, (2) percentage of time a student participated
in Fusion classes over the course of a year, or (3) whether a student participated in 80% or more of
the Fusion classes in a year, which is the dosage recommended by the developers of Fusion Reading.
The second-stage equation is

$$Y_{ik} = \beta_{00} + \beta_{01}\,\mathrm{predFidelity}_{ik} + \beta_{02}\,(COV_{ik} - \overline{COV}_{..}) + e_{ik} + \mu_{0k}$$

which regresses the student outcome on the predicted value of fidelity from the first
stage. The coefficient $\beta_{01}$ associated with predicted fidelity is the estimated TOT of Fusion.
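
The two-stage logic can be illustrated with a deliberately simplified sketch. It omits the covariates and the school random effect used in the study, uses made-up data, and all names are hypothetical. In this simplified case, with random assignment as the only instrument, the second-stage slope equals the ITT difference in the outcome divided by the treatment-control difference in fidelity.

```python
def ols(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(assignment, fidelity, outcome) -> float:
    """Simplified 2SLS: no covariates, no school-level random effect."""
    # Stage 1: regress observed fidelity on random assignment.
    intercept, slope = ols(assignment, fidelity)
    predicted_fidelity = [intercept + slope * z for z in assignment]
    # Stage 2: regress the outcome on the predicted fidelity.
    _, tot = ols(predicted_fidelity, outcome)
    return tot


def mean(values):
    return sum(values) / len(values)


# Synthetic data: four treatment and four control students (illustrative only).
assignment = [1, 1, 1, 1, 0, 0, 0, 0]
dosage = [0.9, 0.8, 0.6, 0.7, 0.0, 0.0, 0.0, 0.0]  # control students have zero dosage
posttest = [92.0, 90.0, 88.0, 90.0, 88.0, 89.0, 87.0, 88.0]

tot = two_stage_least_squares(assignment, dosage, posttest)

# Equivalent ratio form: ITT effect on the outcome over ITT effect on dosage.
itt_outcome = mean(posttest[:4]) - mean(posttest[4:])
itt_dosage = mean(dosage[:4]) - mean(dosage[4:])
ratio = itt_outcome / itt_dosage
```

The ratio form makes the scaling explicit: the TOT estimate is the ITT effect rescaled by how much fidelity the treatment group actually received, so lower average fidelity inflates the per-unit estimate.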


The second approach used propensity score methods to select comparison students for the
high student dosage group and for the low student dosage group. The logic of the propensity
score methods was to select control students who, based on baseline measures of pretest scores,
reading motivation, and demographic characteristics, would have had a similar chance of
attending Fusion classes 80% or more of the time, but did not (Unlu et al., 2010). Seven
parametric and nonparametric propensity score methods (exact, stratified, nearest neighbor,
optimal, full, genetic, and coarsened exact matching) were executed using the R package MatchIt
(Ho, Imai, King, & Stuart, 2007). We applied multiple quantitative and graphic methods to
examine the relative merits of all seven methods and selected the full method because it achieved
the best baseline equivalence on covariates and also preserved the largest sample size. See
Appendix, Table A-2, for details of the model selection. After the comparison students were
selected, the difference between the high student dosage group (80% or more) and their matched
comparison group was estimated on each of the student outcomes. The same analyses were also
conducted for the low student dosage group (< 80%) and their matched comparison students on
all outcomes.
Results 

Attrition Analysis 

Although randomizing students to conditions should result in statistically equivalent groups,
a high overall level of attrition or differential attrition between treatment and control groups
may jeopardize the initial balance and bias the impact estimate (What Works
Clearinghouse, 2008). Data analysis therefore began with an attrition analysis. The treatment group attrition
rate was 24%, the control group attrition rate was 25%, and the differential attrition rate was 1%.
According to the WWC standards (2008), the overall and differential attrition rates are low for this
study.
Baseline Equivalence Analysis 

After the attrition analysis, a descriptive analysis was conducted for students in the analytic
sample, those who had both pretest and posttest scores. Table 1 presents the student background
characteristics (gender, race, and disabilities), pretest scores, and baseline equivalence test results
of the participants in the intervention and comparison groups. For continuous student outcome
variables, the statistical significance of the difference between the two groups at baseline was
determined from HLM analysis. For dichotomous demographic variables, statistical significance
was determined by using a chi-square test. Fusion participants were not significantly different
from control students on demographics or baseline reading measures; however, a non-significant
difference favored control students on CAIMI (t = -1.89, p = .059).
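
For the dichotomous variables, the chi-square test can be sketched as below using the gender row of Table 1. The group totals (283 treatment, 290 control) are not stated in the table; they are back-calculated here from the reported counts and percentages and should be read as an assumption. The function name is illustrative, and no continuity correction is applied.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Male students: 152 of an assumed 283 in treatment, 164 of an assumed 290 in control.
treatment_male, treatment_total = 152, 283
control_male, control_total = 164, 290

statistic, p_value = chi_square_2x2(
    treatment_male, treatment_total - treatment_male,
    control_male, control_total - control_male,
)
print(round(statistic, 2))  # 0.47
```

Under these assumed totals the statistic and p-value come out close to the 0.47 and 0.494 reported for gender in Table 1.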

<Table 1> 
Intent-To-Treat Analysis Results 

Primary estimates of the Fusion impacts were derived from the ITT analyses. Regardless of 
the level of curriculum coverage and student Fusion dosage rate, these analyses compared all 
students who were randomly assigned to Fusion (who were intended to receive the treatment) to 
those who were randomly assigned to the control condition. 

Table 2 shows that the Fusion Reading intervention was successful in improving the sight
word efficiency and sentence comprehension skills of students who were randomly assigned to
receive Fusion classes as compared with those who were assigned to the control condition. Fusion
students had significantly higher TOWRE SWE (p < .05, effect size = 0.10) and GRADE
sentence comprehension (p < .05, effect size = 0.15) scores at posttest than comparison group students
under Model D. Fusion students achieved an average percentile ranking that was approximately
4 and 6 percentile points higher than the ranking of the average student in the comparison group
on SWE and sentence comprehension, respectively. The effect of Fusion on the TOWRE SWE is
the only outcome that remained significant after BH correction. No other student outcomes
showed a statistically significant effect.

<Table 2> 

Sensitivity analyses were conducted to examine whether the effect of the Fusion Reading
intervention was consistent across different model specifications (Models A through D). The
Fusion effect was consistent across the four model specifications. For example, the
significant positive effect of Fusion on TOWRE SWE persisted regardless of whether or not we
controlled for student background characteristics or whether or not we imputed pretest scores.
The positive effect of Fusion on GRADE sentence comprehension was also consistent across the
four models.

To understand the overall impact findings, an exploratory subgroup analysis was performed 
to determine if Fusion had an impact for students in grades that aligned with previous research. 
Grade-level differences were found. 

Table 3 shows the impact of Fusion for each grade on each student outcome. Fusion students
in grade 6 had higher SWE scores (p < .05, effect size = 0.17, improvement index = 7) and
higher MEAP reading scores (p < .10, effect size = 0.15, improvement index = 6) than control
students in the same grade. Fusion students in grade 8 had higher GRADE sentence
comprehension scores (p < .10, effect size = 0.72, improvement index = 26) than control students
in the same grade. Fusion participants in grade 9 scored higher on their motivation to read (p <
.05, effect size = 0.34, improvement index = 13) but scored lower on PDE skills (p < .05,
effect size = -0.20, improvement index = -8), as compared with control students in the same
grade. Note that, because of the small sample sizes, these subgroup analyses
are underpowered.

<Table 3> 
Treatment-On-the-Treated Analysis Results

We hypothesized that the impact of Fusion on reading outcomes was dependent upon the
amount of curriculum covered by the teachers and on student dosage. Table 4 documents Fusion
teachers’ curriculum coverage and Fusion students’ dosage rate. In Year 1, Fusion teachers, on
average, covered 73% of the curriculum, and only 33% of them met the criterion set by the
developer that 80% or more of the curriculum be covered by the teacher. In Year 1, Fusion
students attended 73% of the Fusion classes on average, and only 57% of them met the dosage
criterion set by the developer, namely attending 80% or more of the time allocated for the
class. For the sample of students who had both MEAP data and fidelity data
(n = 229), none of the treatment teachers reached the 80% curriculum coverage criterion, and
only 60 Fusion students reached the required 80% dosage rate. Therefore, an insufficient sample
was available to conduct the treatment-on-the-treated analysis using the MEAP reading outcome.

<Table 4> 

Two approaches were used to examine the effect of Fusion on the students who actually
received the treatment: (1) instrumental variable and (2) propensity scoring. Both methods are
applied in the field, and the results from one method can be used to check the results from the other
(Table 5). For the instrumental variable approach, the first-stage F statistics were all significant at
the p < .001 level, suggesting that treatment assignment was a relevant instrument for the
implementation variables. Fusion students whose teachers covered more Fusion curriculum
achieved higher scores on TOWRE SWE (p < .05) and GRADE sentence comprehension (p <
.05) than students whose teachers covered less Fusion curriculum. A one standard-deviation
increase in the curriculum coverage rate increased the intervention effect on SWE scores by
0.02 standard deviations and on GRADE sentence comprehension scores by 0.04 standard
deviations. Results also suggested statistically significant effects of students’ Fusion dosage rate
on GRADE sentence comprehension. A one standard-deviation increase in Fusion dosage
resulted in a 0.06 standard deviation increase on GRADE sentence comprehension. Students with
an 80% or more dosage rate outperformed those with less than an 80% dosage rate by 0.29 standard
deviations on GRADE sentence comprehension (p < .05, effect size = 0.29).⁸

<Table 5> 

The results using propensity scoring methods to select control students who were similar to
students in the high Fusion dosage group indicated a 0.11 standard deviation improvement on
TOWRE SWE (p < .05, effect size = 0.11, improvement index = 4). As compared with similar
students in the control condition, students with less than an 80% dosage rate showed a 0.08
standard deviation increase in TOWRE SWE (p < .05, effect size = 0.08, improvement
index = 3). Beyond the ITT effects of the Fusion Reading intervention, the instrumental variable
approach and propensity scoring approach both suggested a substantial mediating effect of
student Fusion dosage rate on TOWRE SWE. The TOT and ITT results are
mutually confirmatory, as both indicated a strong effect of Fusion on improving students’ sight word
efficiency skills.

Discussion 
We designed this randomized controlled trial to rigorously study the effects of the Fusion
Reading intervention on adolescent students. Stratifying by school and grade, 581 students who
read between the 5th and 35th percentile on a standardized measure and attended four middle and
two high schools were randomized to a treatment or non-literacy comparison condition. After
implementation of one year of the two-year reading intervention, students in the treatment
condition significantly outperformed students in the comparison classes on a standardized word
reading efficiency measure, the TOWRE, with an effect size of 0.11. Effect sizes on the
GRADE sentence comprehension outcome were notable, although mean differences between
treatment and control groups fell short of significance after adjusting for multiple comparisons.
Considering the rigor of the study design and analysis, and the challenge of positively impacting
standardized reading outcomes with older struggling students, particularly with a teacher-implemented
intervention (versus a researcher-implemented intervention) and in only one year of
the two-year intervention, the study results are encouraging.

8 Mean of 137 Fusion students with an 80% or higher dosage rate was 7.62 with SD of 4.05. Mean of 102 Fusion students
with less than an 80% dosage rate was 6.91 with SD of 3.28.

Results align with previous research on grade-level effects. Middle school students
are more likely than students in grades 9 and 10 to benefit from one year of implementation of
multi-strategy reading interventions on a standardized reading outcome measure. Although the
analysis by grade was underpowered and is therefore considered exploratory, 6th grade Fusion
students outperformed comparison students on word reading efficiency and Michigan’s state
accountability measure in reading, and eighth-grade Fusion students outperformed comparison
students on a standardized sentence comprehension measure. Similarly, Cantrell et al. (2010)
examined the effects of the Learning Strategies Curriculum (LSC) on 6th and 9th grade students’
reading comprehension after one implementation year, finding impacts for 6th grade students but
not for 9th graders on a standardized measure of reading comprehension, the
GRADE. Developed by the University of Kansas’ Center for Research on Learning, the LSC is
a strategy-based curriculum that includes the same strategies as Fusion Reading, such as word
identification, paraphrasing, vocabulary, and summarizing, but also expands the strategies to
assist students with written expression. Our results are also consistent with Scammacca et al.
(2007), who found that interventions (which tended to be multistrategy) yielded larger effect sizes
for middle school students (0.47) than for high school students (0.14) on standardized reading
measures. The consistent finding of differential effects of reading interventions that tend to be
strategy-based across middle and high school grades suggests that strategy-based interventions
might be more helpful in improving the reading skills of middle school students than those of high school
students in the short term. This highlights the challenges in raising achievement levels in high
schools. Future research and policy need to examine new ways to raise the reading skills of
struggling high school readers.

Although effect sizes for teacher-implemented interventions are often smaller than those for researcher-implemented
interventions (Scammacca et al., 2007), the magnitude of the effect size for teacher
implementation of Fusion on word reading was comparable to Scammacca et al.’s findings (0.11
versus 0.21, respectively) for standardized reading outcomes for teacher-delivered interventions.
In general, this study aligns with Slavin et al.’s (2008) report that large-scale studies
produce lower effect sizes than small studies (0.15 versus 0.36).

Finally, this study is one of the first to report promising findings on the mediating
effect of fidelity of implementation (teacher curriculum coverage and student Fusion dosage), an
often neglected independent or mediating variable in study design, analysis, and reporting
(Faggella-Luby & Deshler, 2009). As hypothesized in the logic model, we examined the extent
to which curriculum coverage and dosage would be associated with improved student reading
outcomes. The findings suggested that students whose teachers covered 80% or more of the
curriculum and who received 80% or more dosage scored significantly better on sight word efficiency
and sentence comprehension than students whose teachers covered less of the curriculum and who received less
dosage. These results indicate the importance of fidelity of implementation in large-scale
evaluation studies: only a high level of day-to-day fidelity of implementation by teachers and
students is likely to yield desirable student outcomes. These findings also provide empirical
evidence for the desired or recommended levels of implementation for schools that are currently
implementing the Fusion Reading intervention.

Limitations. Despite the encouraging findings of the evaluation of the one-year effect of Fusion
Reading, several study features limit generalization. First, the Fusion Reading intervention
began as a two-year intervention, but the program was terminated after one year due to U.S.
Department of Education cuts in funding. Future research will need to examine the impact of this
reading intervention as a two-year intervention. Second, Fusion Reading added instructional time
in reading for the treatment group. Although we found a significant positive effect of Fusion
Reading on improving the SWE skills of struggling readers, this increase could be due to the
students’ opportunity to read more. Third, although the evaluation team made efforts to minimize
contamination between the treatment and control conditions (i.e., Fusion teachers signed a
confidentiality statement to restrict teaching only to Fusion students; researchers and Fusion
teachers closely monitored Fusion class rosters), we were not able to control for the possibility
of Fusion students sharing reading strategies with control students outside of Fusion classes.

Conclusion. Stronger research designs with standardized measures, as designed and implemented
in this study of Fusion Reading, typically yield more
reliable estimates of a treatment’s effect and may have greater value for informing practice than
less rigorous designs.

Fusion Reading was engineered in light of (1) necessary and sufficient conditions for
successful reading by focusing on word identification and reading comprehension (Gough &
Tunmer, 1986); and (2) research demonstrating the advantages of cognitive and metacognitive
strategy instruction (Kamil et al., 2008; Scammacca et al., 2009; Slavin et al., 2008). After one
year of implementation of a two-year intervention with adolescent struggling readers, word
reading outcomes were significantly improved with an intervention that explicitly taught
vocabulary, paraphrasing, and word study strategies along with motivation strategies (e.g., setting
goals and reading text relevant for the age group). Our analysis of the mediating effects of
fidelity of implementation emphasized the importance of meeting the developer’s
implementation guidelines in achieving desirable student reading outcomes. Only future research
will allow us to fully understand whether the intended two-year intervention will improve
struggling adolescents’ reading comprehension outcomes.

Table 1

Baseline Equivalence Tests of Fusion and Control Students on Demographic, Reading Achievement, and Reading Motivation
for the Analytic Sample

                                      Treatment               Control
Variable                          M (SD) or %     N       M (SD) or %     N     χ² or t      p
Male                              53.71         152       56.55         164       0.47     0.494
African American                  81.27         230       80.34         233       0.08     0.778
Hispanic/Latino                    7.07          20        6.21          18       0.17     0.679
White                             10.25          29       12.07          35       0.48     0.489
Learning Disabilities             12.37          35       13.79          40       0.26     0.613
Any Disability                     9.19          26        8.97          26       0.009    0.926
TOWRE SWE                         89.63 (9.50)  279       89.73 (9.90)  290       0.21     0.836
TOWRE PDE                         84.54 (14.17) 279       84.84 (14.78) 290       0.21     0.835
GRADE Passage Comprehension       10.33 (4.25)  278       10.54 (4.73)  284      -0.23     0.530
GRADE Vocabulary                  88.29 (11.22) 281       87.41 (12.31) 287       1.07     0.287
MEAP Reading                      -0.87 (0.67)  117       -0.88 (0.70)  135       0.11     0.913
CAIMI Reading                     48.01 (11.13) 267       49.57 (11.04) 275      -1.89     0.059

Note. Standard deviations for continuous variables are in parentheses.

Table 2

Overall Intent-To-Treat Impact Analysis of Fusion on Student Reading Achievement

                              Treatment                      Control
Outcome Measures     Model-Adjusted M    SD      N       M       SD      N    Estimated Impact   Effect Size   Improvement Index      p
TOWRE SWE
  Model A                 90.16         9.64    279    89.06   10.50    290         1.10            0.11             4.38           0.022
  Model B                 90.17         9.64    279    89.06   10.50    290         1.11            0.11             4.38           0.021
  Model C                 90.06         9.63    283    89.04   10.46    297         1.02            0.10             3.98           0.035
  Model D                 90.07         9.63    283    89.04   10.46    297         1.03            0.10             3.98           0.033
TOWRE PDE
  Model A                 85.33        14.18    279    85.26   14.23    290         0.07            0.005            0.20           0.909
  Model B                 85.34        14.18    279    85.26   14.23    290         0.08            0.006            0.24           0.893
  Model C                 85.27        14.16    283    85.21   14.16    297         0.06            0.004            0.16           0.927
  Model D                 85.28        14.16    283    85.21   14.16    297         0.07            0.005            0.20           0.915
GRADE Sentence Comprehension
  Model A                  7.74         3.83    277     7.26    3.65    284         0.48            0.13             5.17           0.078
  Model B                  7.77         3.83    277     7.26    3.65    284         0.51            0.14             5.57           0.061
  Model C                  7.72         3.81    285     7.21    3.63    296         0.51            0.14             5.57           0.055
  Model D                  7.75         3.81    285     7.21    3.63    296         0.54            0.15             5.96           0.043
GRADE Passage Comprehension
  Model A                 11.62         5.18    278    11.56    4.96    284         0.06            0.01             0.40           0.865
  Model B                 11.61         5.18    278    11.56    4.96    284         0.05            0.01             0.40           0.904
  Model C                 11.63         5.13    286    11.56    5.05    296         0.07            0.01             0.40           0.851
  Model D                 11.61         5.13    286    11.56    5.05    296         0.05            0.01             0.40           0.901
GRADE Vocabulary
  Model A                 89.19        10.76    281    88.98   11.36    287         0.21            0.02             0.80           [illegible]
  Model B                 89.17        10.76    281    88.98   11.36    287         0.19            0.02             0.80           0.795
  Model C                 89.05        10.78    287    89.12   11.30    296        -0.07           -0.006           -0.24           0.928
  Model D                 89.02        10.78    287    89.12   11.30    296        -0.10           -0.009           -0.36           0.892
MEAP Reading
  Model A                 -0.72         0.70    117    -0.78    0.72    135         0.06            0.08             3.19           0.359
  Model B                 -0.71         0.70    117    -0.78    0.72    135         0.07            0.10             3.98           0.300
  Model C                 -0.73         0.69    118    -0.80    0.74    138         0.07            0.09             3.59           0.299
  Model D                 -0.72         0.69    118    -0.80    0.74    138         0.08            0.11             4.38           0.243
CAIMI Reading
  Model A                 49.08        10.85    267    48.96   11.24    275         0.12            0.01             0.40           0.880
  Model B                 48.98        10.85    267    48.96   11.24    275         0.02            0.002            0.08           0.983
  Model C                 49.05        10.78    273    48.82   11.33    283         0.23            0.02             0.80           0.777
  Model D                 48.98        10.78    273    48.82   11.33    283         0.16            0.01             0.40           0.840

Note. There were no missing data on demographic variables. Estimated impact is the coefficient associated with the Fusion treatment variable from the HLM model.
Effect size = Estimated impact/SD of the control group. Model-adjusted treatment group mean = Estimated impact + Mean of the control group. Model A = HLM
impact model controlling for pretest without imputation for missing pretests; Model B = HLM impact model controlling for pretest and demographic variables
without imputation for missing pretests; Model C = HLM impact model using the dummy variable adjustment approach for imputing missing pretest scores
(Puma, Robert, Stephen, & Cristofer, 2009), which sets the missing pretest scores to a constant and adds a dummy variable to indicate missingness in the
impact model; Model D = HLM impact model using imputed pretest scores and controlling for pretest and demographic variables.

Table 3

Intent-to-Treat Effect Size of Fusion on Student Outcomes Across Grade Levels

                              TOWRE     TOWRE     GRADE Sentence    GRADE Passage     GRADE        MEAP      CAIMI
Grade Levels                  SWE       PDE       Comprehension     Comprehension     Vocabulary   Reading   Reading
6th grade                     0.17*     0.08         -0.02             -0.11            -0.05       0.15†     0.05
  Treatment N/Control N       96/98     96/98        93/92             93/92            95/96       74/79     95/97
7th grade                    -0.03      0.06          0.17              0.06             0.02      -0.07     -0.09
  Treatment N/Control N       45/63     45/63        44/63             44/63            44/62       43/56     43/62
8th grade                     0.05      0.05          0.72†             0.20             0.24        -       -0.37
  Treatment N/Control N       12/19     12/19        12/19             12/19            12/19        -        11/18
9th grade                     0.06     -0.20*         0.06              0.08            -0.001       -        0.34*
  Treatment N/Control N       79/60     79/60        81/60             82/60            83/60        -        73/50
10th grade                    0.08      0.12          0.31             -0.03             0.01        -       -0.14
  Treatment N/Control N       47/50     47/50        47/50             47/50            47/50        -        45/48

†p < .10. *p < .05. **p < .01. ***p < .001.

Table 4

Description of Fusion Reading Fidelity of Implementation Measures for Fusion Students
in the Analysis Sample

Variables                                                 Mean (%)     SD       N
Teacher Level
  Curriculum Coverage                                      73.00      18.65      9
  Proportion of Teachers with 80%+ Curriculum Coverage     33.00      49.24      9
Student Level
  Fusion dosage rate                                       72.69      28.31    241
  Proportion of Students with 80%+ Dosage                  57.26      49.58    241

Table 5

Treatment-on-the-Treated Effect of Fusion on Student Outcomes

                                      TOWRE         TOWRE      GRADE Sentence   GRADE Passage   GRADE        CAIMI
Method   Variables                    SWE           PDE        Comprehension    Comprehension   Vocabulary   Reading
IV       Curriculum Coverage Rate     0.01*         0.0007        0.008*           0.002          0.0008      0.004
                                     (0.007)       (0.009)       (0.004)          (0.005)        (0.01)      (0.01)
           Effect Size                0.001         0.00005       0.002            0.0004         0.00007     0.0004
           R²                         0.96          0.73          0.30             0.27           0.40        0.33
         Fusion Dosage Rate           0.01         -0.007         0.008*           0.002         -0.002       0.009
                                     (0.007)       (0.009)       (0.004)          (0.005)        (0.01)      (0.01)
           Effect Size                0.001        -0.00005       0.002            0.0004        -0.0002      0.0008
           R²                         0.69          0.74          0.32             0.27           0.39        0.34
         80%+ Dosage                  1.31         -0.92          1.06*            0.28          -0.28        1.13
                                     (0.88)        (1.16)        (0.49)           (0.68)         (1.37)      (1.40)
           Effect Size                0.12         -0.06          0.29             0.06          -0.02        0.10
           R²                         0.69          0.74          0.31             0.27           0.39        0.34
PS       High Dosage (80%+ dosage)    [illegible]  -0.56          0.47            -0.35          -0.23        0.33
                                     (0.44)        (0.88)        (0.33)           (0.43)         (0.69)      (0.74)
           Effect Size                0.11         -0.04          0.13             0.07           0.02        0.03
         Low Dosage (< 80% dosage)    0.84†        -0.71          0.34             0.19          -0.04        0.27
                                     (0.49)        (0.55)        (0.49)           (0.40)         (0.61)      (1.58)
           Effect Size                0.08         -0.05          0.09             0.04          -0.004       0.02

Note. IV = instrumental variable approach; PS = propensity score methods. For the IV model, all first-stage F statistics were statistically significant at the 0.001
level. Coefficients and robust standard errors (in parentheses) are presented. All the models controlled for pretest, gender, race, grade level, and disability.
†p < .10. *p < .05. **p < .01. ***p < .001.


Appendix A 

Table A-1 
HLM Analysis of the Impact of Fusion on Student Reading Achievement Using the Analytic Sample 

                TOWRE SWE               TOWRE PDE               GRADE Sentence          GRADE Passage           GRADE                   MEAP Reading            CAIMI 
                                                                Comprehension           Comprehension           Vocabulary 
Fixed Effect    Model A     Model B     Model A     Model B     Model A     Model B     Model A     Model B     Model A     Model B     Model A     Model B     Model A     Model B 
Intercept       89.14***    94.58***    [illegible] 87.74***    [illegible] 5.53***     11.46***    11.01***    89.45***    93.42***    0.78***     3.02***     48.52***    44.56*** 
                (0.51)      (2.07)      (0.39)      (2.74)      (0.52)      (1.76)      (0.34)      (1.64)      (1.13)      (4.39)      (0.05)      (0.48)      (0.61)      (2.73) 
Pretest         0.85***     0.83***     0.83***     0.82        0.38***     0.36***     0.55***     0.52***     0.53***     0.49***     0.66        0.64        0.53***     0.52*** 
                (0.03)      (0.03)      (0.02)      (0.02)      (0.03)      (0.03)      (0.04)      (0.04)      (0.03)      (0.03)      (0.05)      (0.05)      (0.04)      (0.04) 
Fusion          1.10*       1.11*       0.07        0.08        0.48†       0.51†       0.06        0.05        0.21        0.19        0.06        0.07        0.12        0.02 
                (0.48)      (0.48)      (0.63)      (0.63)      (0.27)      (0.27)      (0.37)      (0.37)      (0.75)      (0.74)      (0.07)      (0.07)      (0.80)      (0.80) 
Male                        0.60                    -0.04                   0.66*                   -0.16                   0.74                    -0.04                   -2.11** 
                            (0.48)                  (0.64)                  (0.27)                  (0.38)                  (0.74)                  (0.07)                  (0.80) 
Black                       -0.85                   -1.26                   -0.22                   -0.26                   -1.81                   0.01                    1.645 
                            (0.76)                  (1.01)                  (0.44)                  (0.60)                  (1.20)                  (0.12)                  (1.23) 
Hispanic                    -1.42                   -3.63***                -0.49                   -0.08                   -1.71                   0.19                    -1.70 
                            (1.15)                  (1.52)                  (0.65)                  (0.89)                  (1.78)                  (0.18)                  (1.90) 
Grade Level                 -0.63**                 -0.13                   0.20                    0.13                    -0.28                   0.11                    0.51 
                            (0.24)                  (0.32)                  (0.20)                  (0.19)                  (0.51)                  (0.07)                  (0.31) 
Disability                  -0.68                   -1.91                   -1.57                   -1.71***                -4.93***                -0.23                   0.64 
                            (0.76)                  (0.98)                  (0.42)                  (0.57)                  (1.14)                  (0.11)                  (1.19) 
Random Effect 
School          1.00        0.83        0.82        1.45        1.63†       2.55†       0.30        0.59        6.96†       9.09        -           -           0.34        0.57 
                (0.84)      (0.91)      (0.98)      (1.44)      (1.05)      (1.89)      (0.37)      (0.66)      (4.58)      (8.84)                              (0.86)      (1.34) 
Residual        32.57***    32.29***    56.71***    56.15***    10.31***    9.99***     [illegible] [illegible] [illegible] [illegible] -           -           86.84***    84.99*** 
                (1.95)      (1.94)      (3.39)      (3.37)      (0.62)      (0.61)      (1.17)      (1.16)      (4.69)      (4.59)                              (5.32)      (5.23) 

†p < .10. *p < .05. **p < .01. ***p < .001. 
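
The random-effect rows of Table A-1 report a school-level component and a residual component. One common way to read such output, assuming the entries are variances from a two-level model with students nested in schools, is the intraclass correlation (ICC): the share of outcome variance that lies between schools after the fixed effects are taken into account. The sketch below computes it for the TOWRE SWE and CAIMI models from the table values; the function name is illustrative and this calculation does not appear in the source.

```python
def icc(school_var: float, residual_var: float) -> float:
    """Share of total (conditional) variance attributable to schools."""
    return school_var / (school_var + residual_var)


# (school component, residual component) as printed in Table A-1,
# assumed here to be variance components.
components = {
    "TOWRE SWE, Model A": (1.00, 32.57),
    "TOWRE SWE, Model B": (0.83, 32.29),
    "CAIMI, Model A": (0.34, 86.84),
    "CAIMI, Model B": (0.57, 84.99),
}

iccs = {name: icc(s, r) for name, (s, r) in components.items()}

for name, value in iccs.items():
    print(f"{name}: ICC = {value:.3f}")
```

Under that assumption, roughly 3% of the pretest-adjusted TOWRE SWE variance and under 1% of the CAIMI variance sits at the school level.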
Table A-2 
Comparison of Seven Propensity Score Methods on Selecting Comparison Group for the Treatment-on-the-Treated Analysis for Treatment Students with High Implementation on TOWRE SWE Outcome 

                                                           Male            Black           Hispanic        Grade Level     Disability      TOWRE Pretest   CAIMI 
                          N of       N of       Total      Mean    SD of   Mean    SD of   Mean    SD of   Mean    SD of   Mean    SD of   Mean    SD of   Mean    SD of 
PS Methods                treatment  control    N          Diff    control Diff    control Diff    control Diff    control Diff    control Diff    control Diff    control 
No matching               133        275        408        0.04    0.50    0.02    0.40    0.01    0.25    0.002   1.54    -0.03   0.35    0.60    9.96    -2.74   11.04 
Exact                     1          1          2          0       NA      0       NA      0       NA      0       NA      0       NA      0       NA      0       NA 
Subclass                  133        275        408        0.03    NA      0.04    NA      0.01    NA      0.11    NA      0.02    NA      1.15    NA      0.31    NA 
Nearest                   133        133        266        0       0.49    -0.02   0.37    0       0.26    -0.03   1.59    0.008   0.32    1       9.96    -0.07   10.63 
Optimal                   133        133        266        0.03    0.50    0.02    0.40    0.008   0.28    -0.03   1.5834  0.03    0.29    -0.02   10.68   0.008   10.74 
Full                      133        275        408        0.03    NA      0.01    NA      -0.01   NA      -0.11   NA      0.02    NA      0.65    NA      0.09    NA 
Genetic                   133        101        234        0       0.49    0       0.39    0       0.27    0       1.54    0       0.33    -0.02   8.54    0       9.90 
Kernel                    132        274        406        0.01    0.49    0.004   0.38    0.003   0.26    0.02    1.56    0.002   0.32    0.08    9.81    -0.76   10.02 
Coarsened Exact Matching  54         76         130        0       0.47    0       0.26    0       0.19    0       1.54    0       0.26    -0.08   6.91    -0.71   8.61 

Note. Gender, race, grade level, disability, pretest scores, and motivation to read scores were used as predictors in the PS model to generate the propensity scores. 
Kernel method is preferable because it generates a large sample size and achieves better baseline equivalence particularly on TOWRE pretest than other methods. 
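
The baseline-equivalence comparison in the note can be made concrete by putting the mean differences on a common scale. The sketch below divides each mean difference by the control-group standard deviation for three of the methods in Table A-2. Scaling by the control SD is one common convention and is an assumption here; the source does not state how equivalence was judged, and the helper name is illustrative.

```python
# (mean difference, SD of control) as printed in Table A-2
balance = {
    "No matching": {"TOWRE pretest": (0.60, 9.96), "CAIMI": (-2.74, 11.04)},
    "Nearest": {"TOWRE pretest": (1.0, 9.96), "CAIMI": (-0.07, 10.63)},
    "Kernel": {"TOWRE pretest": (0.08, 9.81), "CAIMI": (-0.76, 10.02)},
}


def standardized_difference(mean_diff: float, sd_control: float) -> float:
    """Mean difference expressed in control-group SD units."""
    return mean_diff / sd_control


smd = {
    method: {cov: standardized_difference(*vals) for cov, vals in covs.items()}
    for method, covs in balance.items()
}

for method, covs in smd.items():
    summary = ", ".join(f"{cov}: {value:+.3f}" for cov, value in covs.items())
    print(f"{method}: {summary}")
```

On this scale the kernel method reduces the TOWRE pretest difference from about 0.06 SD (no matching) to under 0.01 SD while keeping 406 of the 408 students, which is consistent with the preference stated in the note.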
Population of students: n = 2109 
    Ineligible students: n = 1238 
    Eligible students: N = 871 (1 no parent consent; 115 in waitlist) 

Randomized to Fusion: n = 367 
    Attrition at pretest: [illegible] 
    Participating students at baseline: n = 349 
    Attrition at posttest: n = 70 (18 moved; 41 no posttest; 11 other reason) 
    Participating students at posttest: n = 279 

Randomized to control: n = 388 
    Attrition at pretest: n = 29 (15 dropped out; 14 no pretest) 
    Participating students at baseline: n = 359 
    Attrition at posttest: 61 no posttest 
    Participating students at posttest: n = 290 

Figure 2. 
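
The counts in the participant flow allow a quick check of sample loss between randomization and posttest. The sketch below computes attrition by arm, overall attrition, and the difference between arms from the randomized and posttest counts shown in the figure. These rates are not reported in this section; they are derived here for illustration, and the variable names are hypothetical.

```python
# Counts read from the Figure 2 participant flow
randomized = {"Fusion": 367, "Control": 388}
at_posttest = {"Fusion": 279, "Control": 290}

# Attrition by arm: share of randomized students without posttest data
attrition = {arm: 1 - at_posttest[arm] / randomized[arm] for arm in randomized}

# Overall attrition across both arms, and the gap between the arms
overall = 1 - sum(at_posttest.values()) / sum(randomized.values())
differential = abs(attrition["Fusion"] - attrition["Control"])

for arm, rate in attrition.items():
    print(f"{arm}: {rate:.1%}")
print(f"Overall: {overall:.1%}, differential: {differential:.1%}")
```

By these counts about a quarter of randomized students lack posttest data in each arm, and the two arms differ by a little over one percentage point.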
Figure captions: 

Figure 1. Logic model for Fusion Reading Intervention. 

Figure 2. Overview of the flow of research participants through screening, randomization, 
consent procedures, and data collection of the Fusion Reading Intervention randomized 
controlled trial. 